ABSTRACT The evolutionary origins of Middle East respiratory syndrome (MERS) 
coronavirus (MERS-CoV) are unknown. Current evidence suggests that insectivorous 
bats are likely to be the original source, as several 2c CoVs have been described 
from various species in the family Vespertilionidae. Here, we describe a MERS-like 
CoV identified from a Pipistrellus cf. hesperidus bat sampled in Uganda (strain 
PREDICT/PDF-2180), further supporting the hypothesis that bats are the evolutionary 
source of MERS-CoV. Phylogenetic analysis showed that PREDICT/PDF-2180 is closely 
related to MERS-CoV across much of its genome, consistent with a common ancestry; 
however, the spike protein was highly divergent (46% amino acid identity), suggesting 
that the two viruses may have different receptor binding properties. Indeed, 
several amino acid substitutions were identified in key binding residues that were 
predicted to block PREDICT/PDF-2180 from attaching to the MERS-CoV DPP4 receptor. 
To experimentally test this hypothesis, an infectious MERS-CoV clone expressing 
the PREDICT/PDF-2180 spike protein was generated. Recombinant viruses derived 
from the clone were replication competent but unable to spread and establish new 
infections in Vero cells or primary human airway epithelial cells. Our findings suggest 
that PREDICT/PDF-2180 is unlikely to pose a zoonotic threat. Recombination in 
the S1 subunit of the spike gene was identified as the primary mechanism driving 
variation in the spike phenotype and was likely one of the critical steps in the evolution 
and emergence of MERS-CoV in humans. 


IMPORTANCE Global surveillance efforts for undiscovered viruses are an important 
component of pandemic prevention initiatives. These surveys can be useful for finding 
novel viruses and for gaining insights into the ecological and evolutionary factors 
driving viral diversity; however, finding a viral sequence is not sufficient to determine 
whether it can infect people (i.e., poses a zoonotic threat). Here, we 
investigated the specific zoonotic risk of a MERS-like coronavirus (PREDICT/PDF-2180) 
identified in a bat from Uganda and showed that, despite being closely related to 
MERS-CoV, it is unlikely to pose a threat to humans. We suggest that this approach 
constitutes an appropriate strategy for beginning to determine the zoonotic potential 
of wildlife viruses. By showing that PREDICT/PDF-2180 does not infect cells that 
express the functional receptor for MERS-CoV, we further show that recombination 
was likely to be the critical step that allowed MERS to emerge in humans. 

In 2012, Middle East respiratory syndrome (MERS) emerged in Saudi Arabia. Clusters 
of fatal pneumonia in adults were determined to be caused by a novel lineage C 
betacoronavirus (2c CoV), termed MERS-CoV (1). This was the first 2c CoV known to 
cause disease in humans and at the time of its discovery was most closely related to 
two known bat coronaviruses (2), raising the possibility that bats were a reservoir and 
source for the virus. Concurrently, epidemiologists identified an association between 
MERS infections in patients and their contact with dromedary camels (3, 4). MERS-CoV 
was subsequently detected in camels at a farm linked to two human cases in Qatar (5) 
and in camels in Egypt (6), followed by surveys that demonstrated widespread expo- 
sure to the virus in the Middle East and in North and East Africa as early as the 1980s 
(7-10). It is now clear that camels play an important role in the transmission of 
MERS-CoV to people (11), with seroprevalence highest among those who have had 
contact with camels (12). 

While camels are thought to be important for the transmission of MERS-CoV, bats 
are widely considered to be the evolutionary source of the virus. Several 2c CoVs have 
now been described in bats, including HKU4 from Tylonycteris pachypus (13), HKU5 from 
Pipistrellus abramus (13), and the recently identified NeoCoV from Neoromicia capensis 
(14). NeoCoV is the closest relative yet discovered (85% identical to MERS) and shares 
sufficient genetic similarity in the replicase genes to be considered part of the same 
viral species (15); however, despite being closely related across much of the genome, 
the S1 subunit of the spike gene is highly divergent as a result of a prior recombination 
event. Recombination in the spike gene is particularly significant because the derived 
protein is responsible for host receptor recognition and membrane fusion (16) and thus 
is central in determining host specificity. The S1 subunit contains the receptor binding 
domain and therefore has a specific role in defining host tropism (17). Other processes 
are also important, such as the activation of the spike protein by host proteases (18), 
but the ability of S1 to bind with a host receptor is a critical step in the emergence 
pathway—and it can be quickly altered by a single recombination event. The sequence 
variation in the S1 region of MERS-CoV and NeoCoV could therefore indicate differences 
in host binding preferences. 

Predicting the interactions of virus binding domains with a particular host receptor 
(for example, the human MERS-CoV receptor DPP4) is possible through the use of 
structural modeling and the generation of infectious clones. Protein-protein interac- 
tions can be modeled using a related homologous complex (19, 20) while reverse 
genetic strategies can test the permissiveness of human or other primate cells for 
infectious clones expressing the novel receptor binding domains or complete spike 
glycoprotein (21-24). Pseudotyped lentivirus systems have also been used, for example, 
to show that DPP4 is the receptor for HKU4 but not for the closely related HKU5 (25, 26). 
And while pseudotypes are not always accurate predictors of spike glycoprotein 
function (23), these findings indicate that multiple cell-entry strategies could exist for 
2c viruses and that not all MERS-like CoVs pose an equal risk of zoonotic emergence. 

Here, we investigated the receptor binding properties of a new strain of MERS-like 
CoV found in a bat from Uganda. This virus (PREDICT/PDF-2180) shares the same 
putative S1 subunit recombination that was observed in NeoCoV, allowing us to also 
consider whether the spike recombination was critical for the emergence of MERS-CoV 
in humans. 


RESULTS 

Sampling and site characterization. A bat (identifier [ID] OTBA03-20130220) was 
trapped on 20 February 2013 in the Nkuringo area of Kisoro District, in southwestern 
Uganda (latitude -1.12, longitude 29.68) (Fig. 1). This area is an established settlement 
of villages comprising approximately 15,000 inhabitants adjacent to the southwestern 
boundary of Bwindi Impenetrable National Park. Communities include subsistence 
farmers growing small crops, with some members working inside the national park or 
supporting tourism-related businesses. Livestock, including cattle, pigs, sheep, goats, 
and poultry, are present in the village and are raised on a small scale primarily for local 
consumption. 

FIG 1 Map showing the distribution of Pipistrellus hesperidus (based on International Union for 
Conservation of Nature [IUCN] data) and the location of the bat sampled for the study (Kisoro 
District, southwestern Uganda). 

The sampled bat weighed 3.0 g and had a forearm length of 25 mm (Fig. 1). It was 
identified as Pipistrellus cf. hesperidus based on 95% sequence identity in the cytochrome 
b (Cytb) gene. The cytochrome oxidase subunit 1 (CO1) was also sequenced, 
but no corresponding CO1 sequences for P. hesperidus were available in GenBank for 
comparison. We therefore relied on the Cytb sequence for species identification. 

Discovery and sequence characterization. The oral swab, rectal swab, and whole 
blood of bat OTBA03-20130220 were assayed for the presence of coronavirus by 
consensus PCR (cPCR). Two separate assays were used, each targeting a different region 
of the ORF1b RNA-dependent RNA polymerase (RdRp). Bands of the expected size were 
amplified from the rectal swab (PDF-2180) by both assays and confirmed to represent 
viral products by traditional Sanger dideoxy sequencing. Both fragments showed 
>98% amino acid sequence identity to MERS-CoV, prompting further characterization 
of the virus. The oral swab and blood were negative. 

The near-full-length genome (identified as PREDICT/PDF-2180) was assembled from 
100-nucleotide (nt) Illumina single-end reads at an average depth of 26x. Only the 5’ 
and 3’ noncoding regions were left incomplete. The order of all predicted open reading 
frames (ORFs) was consistent with MERS-CoV and with the recently described NeoCoV 
(KC869678) identified in a bat from South Africa. Similarly, the hexanucleotide tran- 
scription regulatory sequence (AACGAA) was conserved and found in the same position 
as both MERS and NeoCoV upstream of each predicted ORF. 
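
As a rough illustration of how conservation of the transcription regulatory sequence could be checked, the sketch below scans a nucleotide sequence for the AACGAA hexanucleotide and reports where it occurs. Only the motif comes from the text; the function name and the toy sequence are hypothetical, and this is not a description of how the authors located the motif.

```python
def find_motif(sequence, motif="AACGAA"):
    """Return 0-based start positions of every motif occurrence (overlaps allowed)."""
    # Normalise case and treat RNA (U) as DNA (T) so either alphabet can be searched.
    sequence = sequence.upper().replace("U", "T")
    motif = motif.upper().replace("U", "T")
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


# Toy sequence (hypothetical): two copies of the motif, each shortly before an ATG.
toy_genome = "GGAACGAAATGCCCTTTAACGAAATGAAA"
trs_positions = find_motif(toy_genome)
```

In practice the reported positions would be compared with the start coordinates of each predicted ORF to confirm that a copy of the motif lies upstream of it.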

Across the full genome, the sequence had 86.5% amino acid identity to MERS-CoV 
and 91% to NeoCoV; however, considerable variation was observed in different genes. 
TABLE 1 Pairwise amino acid sequence identity of subunit 1 of spike protein of 2c CoVs 
Values are % identity for subunit 1 (% identity for receptor binding domain). 

Accession no.                                         EriCoV/      EriCoV/
and/or isolate              NeoCoV       PDF-2180     2012/174     2012/216     BtCoV/133    HKU4         HKU5-1       HKU5-5       SC2013       EMC-2012     Al-Hasa1     NRC-HKU205
KC869678, NeoCoV
PREDICT/PDF-2180            91.0 (93.1)
KC545383, EriCoV/2012/174   54.7 (59.7)  54.3 (60.4)
KC545386, EriCoV/2012/216   54.9 (59.7)  54.4 (60.4)  99.9 (100)
DQ648794, BtCoV/133         45.1 (42.5)  45.5 (42.5)  43.9 (50.0)  44.0 (50.0)
EF065505, BatCoV HKU4       45.1 (41.8)  45.7 (41.8)  44.3 (50.0)  44.4 (50.0)  95.6 (96.2)
EF065509, BatCoV HKU5-1     47.8 (46.3)  47.7 (47.0)  45.8 (53.7)  45.9 (53.7)  57.7 (63.4)  58.4 (63.4)
EF065512, BatCoV HKU5-5     47.4 (45.5)  47.8 (46.3)  44.6 (50.7)  44.7 (50.7)  56.3 (61.8)  56.7 (61.8)  90.1 (93.9)
KJ473821, SC2013            45.5 (42.5)  46.2 (44.0)  45.8 (55.2)  45.9 (55.2)  59.7 (57.5)  59.7 (56.7)  63.7 (73.9)  64.1 (73.1)
KC667074, EMC-2012          43.5 (40.3)  44.4 (41.0)  43.9 (47.8)  44.1 (47.8)  61.2 (64.9)  61.1 (63.4)  56.5 (63.4)  56.7 (61.1)  61.4 (61.2)
KF186567, Al-Hasa1          43.5 (40.3)  44.4 (41.0)  43.9 (47.8)  44.1 (47.8)  61.4 (65.6)  61.2 (64.1)  56.5 (63.4)  56.7 (61.1)  61.4 (61.2)  99.9 (99.2)
KJ477102, NRC-HKU205        43.7 (39.6)  44.4 (40.3)  43.8 (47.0)  43.9 (47.0)  61.1 (64.9)  61.0 (63.4)  56.3 (62.6)  56.4 (60.3)  61.4 (60.4)  99.1 (98.5)  99.2 (99.2)


Amino acid identity could be as high as 97% to both MERS-CoV and NeoCoV in ORF1b 
or as low as 45% to MERS-CoV in subunit 1 of the spike protein. For the full spike 
protein, identity was 94% to NeoCoV and 63% to MERS-CoV. Percent sequence identity 
of the spike protein (subunits 1 and 2) to other 2c viruses is shown in Tables 1 and 2, 
respectively. Based on the current criteria for species demarcation established by the 
International Committee for the Taxonomy of Viruses (>90% amino acid sequence 
identity in the replicase proteins), PREDICT/PDF-2180 shares sufficient genetic identity 
to MERS-CoV to be considered a member of the MERS-like Coronavirus species. 
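
The identity figures above can be illustrated with a minimal sketch that computes percent amino acid identity from two pre-aligned sequences and applies the >90% replicase threshold quoted in the text. The function names and the toy sequences are hypothetical, and skipping gapped columns is an assumption, since the text does not state how gaps were handled in the published values.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_species_threshold(replicase_identity, threshold=90.0):
    """True when replicase identity exceeds the >90% demarcation criterion."""
    return replicase_identity > threshold


# Toy alignment (hypothetical): 5 comparable columns, 4 of them identical.
toy_identity = percent_identity("MKV-LA", "MKILLA")
```

A value such as the 97% reported for ORF1b would pass `meets_species_threshold`, whereas the 45% reported for spike subunit 1 would not, which is why the replicase rather than the spike is used for species assignment.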

Phylogenetic analysis. Maximum likelihood phylogenetic reconstructions showed 
that PREDICT/PDF-2180 is most closely related to NeoCoV (Fig. 2). The two viruses were 
basal or formed sister clades to MERS-CoV in all genes except subunit 1 of the spike. The 
full-genome alignment was scanned for recombination using seven different algo- 
rithms (RDP, GENECONV, Bootscan, MaxChi, Chimaera, SiScan, and 3seq) implemented 
in RDP v4.46. A single recombination event was detected within the spike gene by RDP, 
Bootscan, MaxChi, Chimaera, SiScan, and 3seq (Bonferroni-corrected P of <<0.001), 
suggesting that the incongruent phylogenies observed between spike subunit 1 and 
the rest of the genome are the result of recombination. Attempts to date the diver- 
gence of these two viruses to estimate the “minimum” number of years since this 
recombination were prevented by evidence of strong negative (purifying) selection 
across the genome (Fig. 2). Given that purifying selection can confound true phyloge- 
netic depth, we felt that attempts to estimate the number of years to common ancestry 
were inappropriate and would result in artificially “recent” dates. 
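
To make the selection argument concrete, the following sketch classifies genes by ω, the ratio of nonsynonymous to synonymous substitution rates (dN/dS), where values well below 1 indicate purifying selection. The ω values are those printed on the panels of Fig. 2; reading the printed "w" as ω, the gene-to-value pairing, the tolerance around 1, and all names are assumptions made for illustration.

```python
def selection_regime(omega, tolerance=0.05):
    """Classify a dN/dS ratio; the tolerance band around 1 is an arbitrary choice."""
    if omega < 0:
        raise ValueError("omega cannot be negative")
    if omega < 1.0 - tolerance:
        return "purifying"
    if omega > 1.0 + tolerance:
        return "positive"
    return "neutral"


# Values as printed on the Fig. 2 panels (pairing with genes assumed from panel order).
panel_omega = {
    "ORF1a": 0.16,
    "ORF1b": 0.06,
    "Spike-1": 0.2,
    "Spike-2": 0.12,
    "M": 0.01,
    "N": 0.2,
}
regimes = {gene: selection_regime(w) for gene, w in panel_omega.items()}
```

Every gene falls in the purifying category under this reading, which is consistent with the authors' decision not to estimate divergence dates.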

Zoonotic potential of PREDICT/PDF-2180. The high genetic variability in subunit 
1 suggests that human and bat strains of MERS have different receptor binding 
properties. To investigate this, we modeled the specific affinity of the PREDICT/PDF- 
2180 spike protein for the human MERS-CoV receptor DPP4 (27). We utilized the crystal 
structure of the MERS-CoV spike binding domain in complex with DPP4 to create a 
homology model for the comparable region of the PREDICT/PDF-2180 spike (Fig. 3). 

Uganda MERS-Like Virus in Bats, mBio, March/April 2017, Volume 8, Issue 2, e00373-17, mbio.asm.org 

TABLE 2 Pairwise amino acid sequence identity of subunit 2 of spike protein of 2c CoVs 
Values are % identity. 

Accession no.                                                                           EriCoV/     EriCoV/
and/or isolate              Al-Hasa1    EMC-2012    NRC-HKU205  NeoCoV      PDF-2180    2012/174    2012/216    BtCoV/133   HKU4        HKU5-1      HKU5-5      SC2013
KF186567, Al-Hasa1
KC667074, EMC-2012          99.7
KJ477102, NRC-HKU205        98.6        98.7
KC869678, NeoCoV            84.7        84.9        84.4
PREDICT/PDF-2180            85.5        85.7        84.9        97.6
KC545383, EriCoV/2012/174   67.8        67.8        67.8        70.2        70.4
KC545386, EriCoV/2012/216   68.3        68.3        68.3        70.5        70.7        96.9
DQ648794, BtCoV/133         72.9        72.9        72.5        73.7        73.3        69.0        68.2
EF065505, BatCoV HKU4       73.0        73.0        72.7        73.8        73.5        69.0        68.3        98.4
EF065509, BatCoV HKU5-1     71.2        71.2        70.9        74.3        73.6        66.9        66.6        79.6        79.1
EF065512, BatCoV HKU5-5     71.7        71.7        71.4        74.5        73.9        66.9        66.6        79.9        79.4        97.9
KJ473821, SC2013            73.5        73.5        73.2        72.5        71.7        67.6        66.8        81.0        80.5        81.8        82.0

Previous work has demonstrated 11 specific amino acid residues in MERS-CoV that 
facilitate binding interactions with the human DPP4 (28). Of these residues, only one is 
conserved for PREDICT/PDF-2180. To determine whether the binding interactions may 
be conserved between DPP4 and PREDICT/PDF-2180 regardless of the differences in 
amino acid residues at these positions, we analyzed the predicted interactions between 
PREDICT/PDF-2180 and DPP4, compared to MERS-CoV and DPP4. Overall, we found a 
global reduction in predicted hydrogen bonding interactions in the DPP4-PREDICT/ 
PDF-2180 binding interface compared with DPP4-MERS-CoV (Fig. 3). While the interactions 
in conserved residue Y499 were maintained, DPP4 interactions with PREDICT/
PDF-2180 residues 501, 502, 510, 511, 513, 539, and 542 were disrupted. The interaction 
between DPP4 Y322 and MERS D510 is abolished in the PREDICT/PDF-2180 prediction, 
where D510 is replaced by K510. This is a charge change from negative to positive. 
Interestingly, a change from R511 in MERS to D511 in PREDICT/PDF-2180 facilitates a 
potential interaction with Y322 to replace the hydrogen bond lost with K510. Regard- 
less, due to the predicted loss of the majority of the DPP4 binding interactions, the 
model predicts that PREDICT/PDF-2180 will not bind to DPP4. 
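
The charge reversals described above can be tabulated with a small sketch. The three residue pairs (Y499 conserved, D510 to K510, R511 to D511) are taken from the text; the charge table is a simplification that treats histidine as neutral, and the function and variable names are hypothetical. It only labels charge classes and says nothing about actual binding energetics.

```python
# Side-chain charge at roughly neutral pH; histidine is treated as neutral (assumption).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}
CHARGE_LABEL = {-1: "negative", 0: "neutral", 1: "positive"}


def charge_change(reference_residue, variant_residue):
    """Describe how side-chain charge differs between two one-letter residues."""
    before = SIDE_CHAIN_CHARGE.get(reference_residue.upper(), 0)
    after = SIDE_CHAIN_CHARGE.get(variant_residue.upper(), 0)
    if before == after:
        return "charge unchanged"
    return f"{CHARGE_LABEL[before]} -> {CHARGE_LABEL[after]}"


# (MERS-CoV residue, PREDICT/PDF-2180 residue) at positions named in the text.
substitutions = {499: ("Y", "Y"), 510: ("D", "K"), 511: ("R", "D")}
charge_report = {pos: charge_change(ref, var) for pos, (ref, var) in substitutions.items()}
```

The report shows opposite reversals at the adjacent positions 510 and 511, which matches the text's observation that D511 might partly compensate for the interaction lost at K510.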

To confirm these results in vitro, a recombinant MERS-CoV cDNA clone was con- 
structed containing the PREDICT/PDF-2180 spike gene in the context of the full-length 
MERS-CoV backbone. The chimeric virus maintains the entire ectodomain of the 
PREDICT/PDF-2180 spike with the exception of the first 20 amino acids of the 5’ end, 
which were taken from wild-type MERS-CoV. Similarly, the transmembrane domains 
(TMDs) and cytoplasmic tail of the chimeric virus used the wild-type MERS-CoV se- 
quence in order to minimize incompatibility in virion formation. Following transfection 
into Vero cells, PCR amplification of leader-containing transcripts for all of the expected 
nested subgenomic (sg) mRNAs (including the sg spike mRNA) confirmed replication of 
the recombinant virus (Fig. 4). However, subsequent passages by supernatant transfer 
to uninfected monolayers failed to reproduce the infection, suggesting that the 
PREDICT/PDF-2180 spike protein is unable to mediate cell entry in Vero cells as seen 
with wild-type MERS-CoV (Fig. 4). 
[Figure 2 panels: i, ORF1a (ω = 0.16); ii, ORF1b (ω = 0.06); iii, Spike-1 (ω = 0.2); iv, Spike-2 (ω = 0.12); vi, M (ω = 0.01); vii, N (ω = 0.2); viii, Full Genome] 

FIG 2 PREDICT/PDF-2180 and NeoCoV are ancestral to MERS-CoV. Maximum likelihood phylogenetic reconstructions of 2c coronaviruses (nucleotide) show that 
PREDICT/PDF-2180 and NeoCoV are consistently basal to, or form sister clades with, MERS-like CoV (human/camel strains), except in subunit 1 of the spike 
protein. Human OC43 is the outgroup. All genes were shown to be under purifying selection (ω). 


Supernatant from the transfected Vero cells (passage 0 [P0]) was also used to infect 
primary human airway epithelial (HAE) cells, which were derived from lung donors with 
no preexisting chronic disease. These well-differentiated primary cells are grown on an 
air-liquid interface and represent an important model for viral infection of the human 
lung. Several coronaviruses show improved replication in these polarized primary 
respiratory cells compared to standard cell lines. Using wild-type MERS-CoV as a 
control, primary HAE cell cultures were infected with passage 0 from the 
PREDICT/PDF-2180-MERS chimeric clone and showed no evidence of viral replication (Fig. 5A). 
Similarly, viral RNA expression analysis indicated no evidence of replication following 
infection with the PREDICT/PDF-2180 chimeric virus (Fig. 5B). In contrast, wild-type 
MERS-CoV induced robust replication as measured by plaque assay and 
viral-leader-containing transcripts. Together, the results indicate that a virus bearing the 
PREDICT/PDF-2180 spike is not likely to replicate efficiently in the human airway without 
further adaptation. 

FIG 3 The spike protein of PREDICT/PDF-2180 is highly divergent. (A) A nucleotide identity Simplot 
shows that PREDICT/PDF-2180 and NeoCoV are closely related to MERS-CoV across much of the genome 
but are highly divergent in subunit 1 of the spike protein, suggesting that they may have different 
receptor binding properties. (B) Variation in key amino acid binding residues (*) and modeling to human 
DPP4 both suggest that PREDICT/PDF-2180 is unable to bind to DPP4. 

FIG 4 Uganda spike protein does not permit entry into Vero cells. (A) Genome organization of MERS-CoV encoding the 
Uganda spike glycoprotein. (Bi) Reverse transcription-PCR detection of leader-containing nested subgenomic mRNAs 
encoding the nucleocapsid transcript, E transcript, and ORF5 and ORF4a transcripts (p0, RNA-transfected cells; p1, passage 
1; p2, passage 2). (Bii) Reverse transcription-PCR amplification of leader-containing mRNA 2 containing the Uganda S gene. 
Note the loss of the leader-containing transcripts in p1 and p2, demonstrating the loss of infectivity associated with 
insertion of the Uganda S gene. Ladder, 1 kb. 


DISCUSSION 

The discovery of PREDICT/PDF-2180 in Uganda adds to the growing number of 
group C betacoronaviruses that have now been identified in bats. These include 
NeoCoV from South Africa (15), Mex_CoV-9 from Mexico (29), BatCoV/KW2E from 
Thailand (30), P.pipi/VM314 from the Netherlands (31), H.sav/206645-40 from Italy (32), 
and BetaCoV/SC2013, HKU4, and HKU5, all from China (33). Collectively, these examples 
demonstrate that the MERS-related CoVs are highly associated with bats and are 
geographically widespread. 

FIG 5 PDF-2180 spike unable to mediate infection of primary human airway cultures. (A) Primary human 
airway epithelial (HAE) cells grown on an air-liquid interface were infected with wild-type MERS-CoV 
(black bars) or passage 0 of PDF-2180/MERS chimeric CoV (red bars) and assayed by plaque assay on Vero 
cells. ND, none detected. (B) Reverse transcription-PCR detection of leader-containing nested 
subgenomic mRNAs encoding the nucleocapsid transcript, E transcript, and ORF5 and ORF4a transcripts 
following infection. Ladder, 1 kb; WT, wild type. 

The group 2c viruses appear to have a particular, though not exclusive, association 
with vespertilionid bats, which form a highly diverse and widely distributed family 
within the Microchiroptera. NeoCoV, SC2013, HKU4, HKU5, H.sav/206645-40, P.pipi/ 
VM314, and PREDICT/PDF-2180 were all found in species belonging to this family. If the 
full diversity of 2c viruses reflects the number of vespertilionid species described (n = 
475 species), there is potential for a substantial diversity of MERS-related viruses to be 
circulating in bats. 

Our data suggest that PREDICT/PDF-2180 cannot infect humans and is not likely to 
pose a threat to human health, at least in its current form. The spike protein of this virus 
is distinct from the MERS-CoV spike, sharing only 46% amino acid identity, and it 
appears unable to enter cells that express the functional receptor used by MERS-CoV 
(DPP4)— or any other receptor expressed by either primate Vero cells or human airway 
epithelial cells. Importantly, failure to assemble and release viral particles from the 
initial infection could also explain our results; however, we suggest that receptor 
incompatibility is more likely given the steps taken to minimize particle disruption (see 
Materials and Methods). These results suggest that adaptation of the spike would be 
required to permit PREDICT/PDF-2180 replication in human airways. While we did not 
examine the specific binding properties of the related virus NeoCoV, the high amino 
acid sequence identity with PREDICT/PDF-2180 indicates that it shares a similar phe- 
notype and is most likely also refractory for human infections. 

Our data suggest that RNA recombination is the mechanism that underlies the 
observed difference in receptor binding. Recombination can occur at high frequency 
during mixed coronavirus infection, allowing different viral lineages to exchange 
specific functional motifs or even entire genes (22, 34, 35). Phylogenetic incongruence 
was noted in subunit 1 of the spike protein, and breakpoints were observed in this 
same region by multiple recombination detection algorithms. It is also parsimonious 
with the high purifying selection observed across the genome of 2c viruses (which 
argues against receptor adaptation via drift or selection) and with previous reports 
citing recombination in association with host switching for other coronaviruses (36-38). 
Given that the recombination is observed in both PREDICT/PDF-2180 and NeoCoV, we 
support the previous suggestion by Corman et al. (15) that it was the MERS-CoV that 
acquired a new spike. Given also that the PREDICT/PDF-2180 spike does not use DPP4 
and is seemingly not competent for human infection, we further suggest that the 
recombination event was the critical factor driving the emergence of MERS-CoV. 

What is less clear is whether this recombination occurred in bats or an intermediate 
host. Lineage 2c strains that use DPP4 have been reported in bats (25, 26), and there 
is also evidence of positive selection in the bat DPP4 that would indicate the existence 
of a large diversity of (as-yet-unknown) DPP4-competent strains (39). Just as detailed 
metagenomics studies have revealed the presence of several severe acute respiratory 
syndrome (SARS)-like bat CoVs that can use the human angiotensin converting enzyme 
2 receptor and/or replicate efficiently in human cells (23, 24, 40-42), it seems likely that 
subsets of diverse MERS-CoV-like bat coronaviruses will also exist which are prepro- 
grammed to efficiently use the human DPP4 receptor. This would support the hypoth- 
esis that the recombination occurred in bats; however, the MERS-CoV spike seems to 
have adapted and acquired a preference for human DPP4 over the bat homologue (26, 
43) making it difficult to conclude with certainty that the MERS-CoV spike has bat 
origins. Increased surveillance will be required to understand the full diversity of spike 
phenotypes circulating in bats or in intermediate hosts such as camels. 

In recent years, global surveillance efforts such as the USAID Emerging Pandemic 
Threats PREDICT program have advanced our understanding of the viral diversity that 
exists in wildlife (44). While this knowledge can be useful for proving the existence of 
novel viruses (29, 30, 45-49), quantifying overall viral diversity (45, 46), and measuring 
infection prevalence within a population, it does not provide information on their 
specific zoonotic threat. Given that no single correlate of pathogenicity or virulence has 
been determined for any viral family (50, 51) and that it is not possible to determine risk 
through phylogenetic data alone (51), the approach used here is an important tool in 
characterizing the zoonotic potential of viral sequences detected in wildlife. Doing so 
on a large scale (for example, as part of projects like USAID PREDICT) will also provide 
critical information on host and geographic variation in key viral traits, like potential 
host tropism, which are currently missing from most risk-based models forecasting hot 
spots of disease emergence. 


MATERIALS AND METHODS 


Sampling. A bat (ID OTBA03-20130220) was trapped on 20 February 2013 in the Nkuringo area of 
Kisoro District in southwestern Uganda. The bat was caught with a mist net (3.8-mm mesh; Avinet, Inc.) 
according to established protocols and was released unharmed postsampling. Standard morphometric 
measurements (weight and forearm length) and photographs were obtained to aid species identification, 
which was confirmed by DNA barcoding of the cytochrome b (Cytb) and cytochrome oxidase subunit 1 
(CO1) mitochondrial DNA genes (52). Approximately 200 µl of whole blood was collected into EDTA. Oral 
and rectal swabs were also collected in duplicate (one into viral transport medium and one dry). 
Specimens were stored temporarily on gel packs and frozen in liquid nitrogen in the field within 4 h of 
collection and then transferred to -80°C for storage until testing. Samples were transferred to the Center 
for Infection and Immunity at Columbia University for viral discovery and characterization. 

Coronavirus discovery by consensus PCR. Total nucleic acid (TNA) was extracted using the Roche 
MagNA Pure 96 platform according to the manufacturer's instructions. TNA was reverse transcribed into 
cDNA using SuperScript III (Invitrogen) according to the manufacturer's instructions. Two broadly 
reactive consensus PCR assays targeting partial and nonoverlapping regions of the coronavirus ORF1b 
(containing the RdRp) were performed (53, 54). Bands of the expected size were excised from 1% 
agarose, cloned into Strataclone PCR cloning vector, and sequenced to confirm detection. 

Sequencing and bioinformatic processing. Total RNA extract was DNase treated (DNase I; Ambion, 
Life Technologies, Inc.) and reverse transcribed using SuperScript III (Invitrogen, Life Technologies, Inc.) 
with random hexamer primers. The cDNA was RNase H treated before second-strand synthesis with 
Klenow fragment (3’ to 5’ exonuclease) (New England Biolabs). The resulting double-stranded cDNA was 
sheared to 200-bp (average) fragments using a Covaris focused ultrasonicator E210, according to the 
manufacturer's standard settings, and used for library construction using the Kapa Hyper library 
preparation kit (Kapa Biosystems, Roche), again according to the manufacturer's instructions. The final 
library was quantified using an Agilent Bioanalyzer 2100 and pooled to allocate 20 million reads on the 
Illumina HiSeq 2500 platform. 

The Q30-filtered FastQ files were used to generate quality control reports using PRINSEQ software 
(v0.20.2) (55) and were further filtered and trimmed. Host background levels were determined by 
mapping the filtered reads against a bat reference database using Bowtie2 mapper (v2.0.6, 
http://bowtie-bio.sourceforge.net) (56). The host-subtracted reads were de novo assembled using MIRA assembler 
(v4.0) (57). Contigs and unique singletons were subjected to homology search using MegaBlast against 
the GenBank nucleotide database. Sequences that showed poor or no homology at the nucleotide level 
were screened by BLASTX against the viral GenBank protein database. Viral sequences from BLASTX 
analysis were subjected to another round of BLASTX homology search against the entire GenBank 
protein database to correct for biased E values and taxonomic misassignments. The genome of 
PREDICT/PDF-2180 was mapped with Bowtie2 against the filtered data set to visualize depth and 
coverage in Integrated Genomics Viewer. 
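
As an illustration of what "depth and coverage" summarize, the sketch below computes mean depth and breadth of coverage from a list of per-position read depths, plus the expected depth from read count, read length, and genome length. The function names and the numbers in the example (7,800 reads, a 30,000-nt genome) are hypothetical and chosen only so that the arithmetic reproduces a 26x depth with 100-nt reads; they are not values reported by the authors.

```python
def coverage_summary(depths, min_depth=1):
    """Return (mean depth, fraction of positions covered by at least min_depth reads)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    breadth = sum(1 for d in depths if d >= min_depth) / len(depths)
    return mean_depth, breadth


def expected_depth(read_count, read_length, genome_length):
    """Average depth expected if reads map uniformly across the genome."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return read_count * read_length / genome_length


# Hypothetical inputs for illustration only.
toy_mean, toy_breadth = coverage_summary([0, 10, 20, 30])
toy_expected = expected_depth(read_count=7800, read_length=100, genome_length=30000)
```

Per-position depths of this kind are what a genome browser displays when mapped reads are loaded against the assembled genome.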

Genetic and phylogenetic analyses. Sequences were analyzed and edited using Geneious (version 
6.0.3). Full genome and individual gene sequences were aligned with ClustalW, and maximum likelihood 
phylogenetic trees were constructed in PAUP* (500 bootstraps). Models of nucleotide substitution were 
selected using jModelTest. Nucleotide sequence similarity between MERS-like viruses was assessed using 
Simplot v3.5.1 (58) with a sliding window size of 500 bp, a step size of 50 nucleotides, and 1,000 
bootstrap replicates using gap-stripped alignments and the F84 (maximum likelihood) distance model. 
The full-genome alignment was scanned for recombination using seven different algorithms (RDP, 
GENECONV, Bootscan, MaxChi, Chimaera, SiScan, and 3seq) implemented in RDP (v4.46) (59). 
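
The sliding-window comparison described above can be illustrated with a short sketch. This is a simplified stand-in, not Simplot itself: it reports raw percent identity per window rather than F84 distances, it omits bootstrapping, and the function names are hypothetical. The defaults mirror the window (500) and step (50) sizes stated in the text.

```python
def gap_strip(seq_a, seq_b, gap="-"):
    """Drop every alignment column in which either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    kept = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    return "".join(a for a, _ in kept), "".join(b for _, b in kept)


def sliding_identity(seq_a, seq_b, window=500, step=50):
    """Return (window midpoint, percent identity) pairs along the alignment."""
    a, b = gap_strip(seq_a.upper(), seq_b.upper())
    points = []
    for start in range(0, len(a) - window + 1, step):
        win_a = a[start:start + window]
        win_b = b[start:start + window]
        matches = sum(x == y for x, y in zip(win_a, win_b))
        points.append((start + window // 2, 100.0 * matches / window))
    return points


# Toy example (hypothetical sequences) with a small window so the result is easy to check.
toy_points = sliding_identity("AAAACCCCGG-TT", "AAAATCCCGGATT", window=4, step=4)
```

Midpoints are reported in gap-stripped coordinates, so they would need to be mapped back to genome positions before plotting against an annotated genome.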

Structural modeling. Predicted differences in DPP4 binding between the MERS-CoV and Uganda 
(PREDICT/PDF-2180) spikes were determined by structural analysis. The crystal structure showing the 
interactions between DPP4 and the MERS spike binding domain has been reported previously (28) and is 
available as PDB ID 4KRO. We created a homology model of the region of the Uganda spike protein homologous to the MERS 
spike binding domain based on the 4KRO structure in association with DPP4. We first aligned the amino 
acid sequences for 4KRO (28) and the Uganda spike using Clustal Omega (60). We then used MODELLER 
(61) to create predicted structural coordinates for the Uganda spike based on the coordinates of 4KRO. 
Because MODELLER requires the two sequences to be the same length, we introduced gaps in the 
sequences where appropriate to maintain the best sequence identity between the 2 amino acid 
sequences. Numbering is based on MERS-CoV amino acid residues. We then imported the predicted 
structure for the Uganda spike and the known DPP4-MERS structure into PyMOL (62) for visualization and 
comparative analysis. Hydrogen bonding interactions were predicted by selecting the known DPP4 and 
4KRO or the homologous DPP4 and Uganda interaction sites and using the “find polar interactions” 
function within PyMOL. 
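
Because residue numbering follows the MERS-CoV sequence, each aligned Uganda residue has to be looked up by its MERS-CoV position. The sketch below shows one way to do that from a gapped pairwise alignment; it is an illustration under assumptions, with hypothetical function names and toy sequences, and is not the procedure or code used in this study.

```python
def reference_numbering(aligned_ref, aligned_query, gap="-"):
    """Map 1-based reference residue numbers to (ref residue, query residue or None)."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res == gap:
            continue  # insertion in the query: no reference number to assign
        ref_pos += 1
        mapping[ref_pos] = (ref_res, None if query_res == gap else query_res)
    return mapping


def substitutions(mapping, positions):
    """List reference-numbered differences (e.g. 'L4E') at the given positions."""
    changes = []
    for pos in positions:
        ref_res, query_res = mapping[pos]
        if query_res is None:
            changes.append(f"{ref_res}{pos}del")
        elif query_res != ref_res:
            changes.append(f"{ref_res}{pos}{query_res}")
    return changes


# Toy alignment (hypothetical): the reference has a gap at column 3, the query at column 6.
toy_map = reference_numbering("MK-DLYW", "MKADE-W")
toy_subs = substitutions(toy_map, [1, 3, 4, 5, 6])
```

Applied to the real alignment, the positions passed to `substitutions` would be the DPP4-contacting residues of the MERS-CoV spike.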

Generation of a MERS-CoV recombinant virus. Previously, we reported the isolation of recombinant 
MERS-CoV that was derived from a cDNA clone (63). To reconstitute a MERS genome expressing the 
PREDICT/PDF-2180 CoV spike, new E and F plasmids were ordered synthetically (Bio-Basic) to contain the 
PREDICT/PDF-2180 spike ectodomain; these plasmids were named MERS-Uganda E and F. MERS ORF1 
and ORF2 overlap, so to maintain a functional replicase sequence and signal sequence for spike, the first 
20 amino acids of the MERS spike were retained and the PREDICT/PDF-2180 sequence was fused in frame 
downstream of the MERS-CoV S glycoprotein signal peptidase domain beginning at its 24th amino acid. 
In short, the sequence of the MERS spike coding for amino acids 21 to 1306 was replaced with the 
sequence of the PREDICT/PDF-2180 spike coding for amino acids 24 to 1298, so that following processing, 
an intact spike glycoprotein was expressed during virus infection. The E and F plasmids were 
sequence verified prior to the assembly of full-length recombinant DNAs. 
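
The junction coordinates given above can be checked with simple index arithmetic. The sketch below works at the protein level with placeholder strings; the function name is hypothetical, the backbone length of 1,353 residues is an assumption used only to give the toy example a C-terminal tail, and no real sequence data are included.

```python
def build_chimeric_spike(backbone, donor, keep_n=20, replaced_end=1306,
                         donor_start=24, donor_end=1298):
    """Fuse a donor ectodomain into a backbone spike (1-based, inclusive coordinates)."""
    if len(backbone) < replaced_end or len(donor) < donor_end:
        raise ValueError("sequence shorter than the requested coordinates")
    signal = backbone[:keep_n]                      # backbone residues 1..keep_n
    ectodomain = donor[donor_start - 1:donor_end]   # donor residues donor_start..donor_end
    tail = backbone[replaced_end:]                  # backbone residues replaced_end+1 onward
    return signal + ectodomain + tail


# Placeholder sequences: 'm' marks backbone (MERS) residues, 'u' marks donor residues.
toy_backbone = "m" * 1353   # assumed backbone length, for illustration only
toy_donor = "u" * 1298      # just long enough to cover donor residue 1298
toy_chimera = build_chimeric_spike(toy_backbone, toy_donor)
```

With these coordinates, 1,286 backbone residues (21 to 1306) are exchanged for 1,275 donor residues (24 to 1298), so the chimeric protein is 11 residues shorter than the backbone.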

The MERS A through F inserts (containing the Uganda S gene) were restriction digested, resolved on 
0.8% agarose gels, visualized, excised, and purified using a QIAquick gel extraction kit (Qiagen). The MERS 
A to F inserts were mixed and ligated overnight at 4°C, phenol-chloroform extracted, and precipitated 
under isopropyl alcohol. Full-length T7 transcripts were generated in vitro as described by the manufacturer 
(Ambion; mMessage mMachine) with certain modifications (63). For MERS-CoV N transcripts, 
1 µg of plasmid DNA containing the N gene (amplified using forward primer 5’-ATTTAGGTGACACTAT 
AGATGGCATCCCCTGCTGCACC-3’ and reverse primer 5’-TTTTTTTTTTTTTTTTTTTTTTCTAATCAGTGTTAACA 
TCAATCATTGG-3’) was transcribed by SP6 RNA polymerase with a 4:1 ratio of cap analog to GTP. RNA 
transcripts were added to 800 µl of Vero cell suspension (8.0 × 10⁶ cells) in an electroporation cuvette, 
and four electrical pulses of 450 V at 50 µF were delivered with a Gene Pulser II electroporator (Bio-Rad). 
The transfected Vero cells were allowed to recover for 10 min at room temperature and then incubated 
at 37°C for 2 to 4 days in a 75-cm² flask. Virus progeny were then passaged several times in Vero cells 
or primary human airway epithelial cells for 48 h to detect viable viruses. All viruses were maintained 
under biosafety level 3 (BSL3) conditions with redundant fans, and personnel used powered air-purifying 
respirators (PAPRs) and Tyvek suits. 
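
The forward primer used to amplify the N gene template begins with a promoter for SP6 RNA polymerase, followed by the gene-specific sequence. The sketch below separates the two parts. It assumes the commonly cited 18-nt SP6 promoter sequence, and the function name is hypothetical; it is an illustration of the primer layout, not part of the original protocol.

```python
# Commonly cited SP6 promoter sequence (an assumption of this sketch).
SP6_PROMOTER = "ATTTAGGTGACACTATAG"

# Forward primer as given in the text, joined across the line break.
FORWARD_PRIMER = "ATTTAGGTGACACTATAGATGGCATCCCCTGCTGCACC"


def split_promoter_primer(primer, promoter=SP6_PROMOTER):
    """Split a primer into its 5' promoter and 3' gene-specific portions."""
    primer = primer.upper()
    if not primer.startswith(promoter):
        raise ValueError("primer does not begin with the expected promoter")
    return promoter, primer[len(promoter):]


promoter_part, gene_part = split_promoter_primer(FORWARD_PRIMER)
# The gene-specific portion should open with a start codon if it anneals
# at the beginning of the N open reading frame.
starts_with_atg = gene_part.startswith("ATG")
```

The same check could be applied to any promoter-tagged primer by passing a different `promoter` argument.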

To detect leader-containing RNAs, intracellular RNA from wild-type and recombinant MERS-CoV-Uganda 
(rMERS-CoV-Uganda) was reverse transcribed with a primer at the 3’ end of the genome and 
cDNA was isolated for PCR using a reverse primer located in ORF5 and a forward primer located in the 
leader RNA sequence at the 5’ end of the genome (5’-CTATCTCACTTCCCCTCGTTCTC-3’). Leader- 
containing amplicons were sequenced as previously described (64). The cDNA products were separated 
and visualized in 0.8% agarose gels. 

Viruses, cells, and infection. Wild-type and chimeric CoVs were cultured on Vero E6 cells, grown in 
Dulbecco modified Eagle medium (DMEM) (Gibco, Carlsbad, CA) and 5% fetal clone serum (HyClone, 
South Logan, UT) along with antibiotic-antimycotic (anti-anti; Gibco, Carlsbad, CA). Growth curves in Vero 
and primary human airway epithelial cells were performed as previously described (65, 66). Human lungs 
were procured under University of North Carolina at Chapel Hill Institutional Review Board-approved 
protocols. 

Biosafety and biosecurity. Reported studies were initiated after the NIH and the University of North 
Carolina Institutional Biosafety Committee approved the experimental protocol (project title, Generating 
Infectious Clones of Bat SARS-like CoVs; lab safety plan ID, 20167715; schedule G ID, 19982). 

Accession number(s). The near-complete genome sequence for PREDICT/PDF-2180 has been deposited 
in GenBank under accession number KX574227. 